Thank you for coming to my talk. It's always a treat to be able to do this. I've had the
opportunity to do a lot of really cool things in my career and with bots, but the one thing
that gave me more satisfaction than anything else I've ever done is the time I wrote a
botnet that purchased millions of dollars worth of cars and defeated the Russian hackers.
So let's have some fun with this, all right? I'm going to tell you a story that involves
hacking. It involves cars. I like cars. It involves Russian hackers, which is pretty cool.
And more than anything else, it involves screwing with the system. Thank you. Thank you.
Or as I like to tell my mother, creating competitive advantages for clients.
That's important. It's easier to get a loan that way, too. So I've been writing bots since about
95. Started out doing remote medicine bots, if you can believe that. I've been involved with
privacy, fraud detection, private investigations. I've done work for foreign
governments. And I've got a fair amount of my business that is with automotive clients. What
makes me a little bit different than, I mean, a lot of people write bots. What makes me a
little different is I actually talk about it. Unfortunately the
only projects I get to talk about are things that are in
house projects that I've been doing. It's really rare that I
get a chance to talk about a specific project that I've done
for a client. But I got permission to talk about this
one. And it came about largely because when my last book was
done, this one, through No Starch Press, by the way, Linux
Magazine approached No Starch, and they said, can Mike
write an article for us? And I really didn't have anything
ready to write for them. So I approached this old client and I
said, you know, enough time has passed. It's been like six
years. Let me write about this for a change. And they agreed to
let me do this. But that's really key. Because when you've
got a piece of technology that provides a competitive
advantage or allows you to screw with the system strategically,
you don't want to tell people about it, right? Because that's
your, it's a trade secret, really. So if you want to get a
little bit different view of this project, if you can pick up
one of the old copies of Linux magazine, I write about it in a
little bit different way than the way I'm presenting it here
tonight. Okay. What are you going to learn? You're going to
learn what makes a good bot project. I'm going to have to
give you a little bit of insight in how retail automotive works
in order for this whole thing to make sense. You're going to get
an awareness of commercial bots and botnets, and they actually do
exist. And I'm also going to talk a little bit about if I
were to do this again today, how would I do this differently?
Because keep in mind, this happened like six, seven years
ago. So what makes a good bot project? The very first thing you
need to know is that you cannot be afraid to do something
different. Okay? If your company has, you know, you've got a
company that has an Internet strategy, assuming it has an
Internet strategy, that just involves browsers and things you
can do with a browser, you're really missing out. Because you
got the whole big, wide Internet available to you, and
everybody uses the same tool, the browser, right, to access
it. And if you expand your scope a little bit and do things
outside of the way browsers work or do things outside of the
way websites are presented to you, you can create a lot of
really cool things. Okay. Don't assume, just to raise a hand,
how many people here have written a screen scraper? Okay. Cool. How
many people have written a spider? Wow. Cool. Cool. Well,
just, if you've got a client, make sure they realize that just
because you know how to scrape screens, you can write a spider,
it doesn't mean you can make a copy of the Internet. Okay? And
you'd be surprised. I get people approaching me all the time with
ideas for projects. A lot of them, they don't know how to
do it. They basically want to create a copy of the Internet.
So if your project requires both back-end processing and real-time
results, you've got a problem. Or if you've got a project that
requires just ridiculous scaling, you've got a problem.
Because unless you've got one of these, your project's going to
fail. You know, you're not going to replicate Google unless
you've got one of these. And then I tell clients after I say,
you know, you really can't do this, it's like, why not? And I
say, well, because Google spends about a million dollars a day on
electricity. That's why. That's why your project's going to fail.
Realize that you don't own, I refer to targets as the subject
server. Don't assume that you own that server. Okay? For
example, I had a potential client approach me a few years
ago, and he wanted to monitor prices on Amazon. About 100,000,
for about 100,000 items. I thought, that sounds really like
a useful thing to do. This guy was a big-time Amazon seller.
Until I found out that he wanted to do this every five
seconds. That's not going to work. It's not going to work.
For lots of reasons. If you did something like this, Amazon would
actually have to build additional infrastructure to
support your project. And you'd end up in court with what they
call a trespass to chattels suit. And you want to avoid that. It's
very illegal. Okay. Number four, and this is maybe the most
important thing. You have to have a realistic profit model.
You notice I'm saying profit model and not business model.
Why do I say that? This is why. Okay? And if I'm showing my age
here a little bit, you can look at these. My sales are
growing. I think I made the list twice. I think that's pretty
impressive. That's staying power. So why is it important
that you have a realistic profit model? You know, why is it that
when people approach me and they want to do something that could
just as easily be done on eBay, for example, this is important
because the developer has to get paid. Okay? That's very
important. Okay. About automotive retailing. Just a
little bit here. Without this, the project doesn't make sense.
New car sales are not as profitable as people think they
are. Even if you combine service with that because it's
incredibly capital intensive. And it's super, super
competitive. But you need to have new car sales so you've
got credibility if you want to sell used cars. This is
particularly true if you want to sell high end used cars. Nobody
wants to go to the corner lot for that kind of stuff. The thing
that I learned and I didn't realize, I just assumed that all
the used cars on a car lot were all trade ins. Well, that's not
the case. And it can't be the case because you can't grow a
business if you're going to do that, right? And it's really
limiting. Car dealerships spend tons of money acquiring good
used cars to put on the car lot. And it's kind of bizarre the
way it works because you walk into a car lot and you know
what the price should be for a particular car because it's very
well documented, right? You can go to Kelley Blue Book or any
place. So dealers don't have a lot of space to work on the
price, the final retail price. But down on the wholesale side,
that's where the profits and that's where the margins are
made. If you're good at buying things for a great price, that's
how you make money with used cars. And that's what this
project is about. So a car dealer came
to me. He had this great opportunity. He found this
wonderful website. It was part of the national franchise. They
were getting in used rental cars, two years old, 12 to 16,000
miles, perfect cars that you'd want to have on your lot. Well
maintained. Unfortunately, there was a lot of competition for
these cars because all the people in that dealership chain
wanted the same cars. And the website was horrible, and it
caused a lot of frustration. This is kind of the
way it worked. There would be maybe 200 to 300 cars presented
every day and the cars would have little display ads like this
that gave a little bit of a description and there was an
inactive buy now button, okay. And at exactly sale time, that
button would appear. Okay. But the problem with this was that
it wasn't using Ajax or anything. You had to physically sit and refresh the browser
constantly to get that button to appear. Well, this led to another problem in that there
was incredible server lag. My client, and I think he was probably pretty typical of
all of them in this chain, he would grab every person he could find, people out of parts,
you know, off the sales floor, administrative assistants, set them all in front of computers
and each one of them was assigned maybe about six cars. So they'd have six browser windows
open and they're all sitting there frantically hitting the refresh button constantly. So
if you think about this, okay, so this would have been roughly the equivalent of 36 users
for this one dealership. I don't know, maybe there were 750 dealers that were doing this.
So that was almost 30,000 simultaneous downloads that were happening at sale time. And what
made this worse, I mean, servers should be able to handle that, right? But I think there
was some inefficiency with the database possibly, some bad queries were being made. And this
caused a ridiculous peak in server lag time right at the point where you don't want to
have it. And it would take, you know, it wouldn't be unusual for it to take 15 or 30 seconds
for the screen to refresh at sale time.
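
As a rough back-of-the-envelope check on those numbers, here is a small sketch. The figure of six people per dealership is an assumption inferred from "six windows each" and "36 users"; the 750 dealers is the speaker's own guess, and the function name is made up for illustration.

```python
def simultaneous_requests(people: int, windows_each: int, dealers: int) -> int:
    """Estimate how many browser windows are refreshing at sale time."""
    return people * windows_each * dealers


# Assumed: six people per dealership, six browser windows per person.
per_dealer = simultaneous_requests(people=6, windows_each=6, dealers=1)

# Assumed: roughly 750 dealerships doing the same thing at once.
total = simultaneous_requests(people=6, windows_each=6, dealers=750)

print(per_dealer, total)
```

With these assumptions the total comes to 27,000, which is in line with the "almost 30,000" quoted in the talk.
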
Sometimes it would just time out. So this was a real problem.
The other problem is that out of these, say, 200 cars that were up for sale every day,
there were maybe five that every single dealership in the country wanted. Either because they
were the right color, probably because they were a really great price. Or for whatever
reason, I don't know. But every dealership would want these five cars.
So he had a lot of competition for the same cars. Plus server lag, bad web design. He had to
involve a lot of people to do this. So this particular client, I had written a number of bots
for him in the past. And he gave me a call and said, can you help me out, Mike? I said, well,
let's take a look. So the problems were the system was way too manual to begin with. So the way
this would work, he would have to manually go and select the cars that he wanted to buy. He'd have to distribute
the VIN numbers to the various people. He'd have to call people in off of their normal
duties that they would be doing. They'd be dedicating probably a good 15 to 20 minutes
hitting the refresh button every day. So that wasn't good. Plus the buy button took
way too long to appear because of the server lag. So we came up with, we ended up
with two solutions. One of them because the server lag was so bad. The second
one because we had competition. So let's look at
phase one first here. And again, this is not like classic bot design. And keep in mind, this was
done like six years ago. So I don't develop like this anymore. Okay. So here's what I did. I
came up with a web interface for my client. And if you look here, this is basically just four
HTML frames that were independent from each other. And, you know, they could just go to a URL,
pull this up. And by the way, I say botnet, but this was all done on computers that we
controlled ‑‑ not controlled, we owned. Okay. There's a difference. Right? In fact, all of the
bots that I write, they're all commercial bots. We own all the hardware. Okay. I just want to let
you guys know that. So instead of hauling in all these people to hit the refresh button
constantly while they should be doing something else, my client was able to
pull up something like this. And quite frequently, he would have
two or three computers set up with this in the browser. And he would just select what cars he
wanted. The first step was to log on. They had several accounts ‑‑ it was a closed sale,
basically. And they had several accounts they could use. So the first thing they would do is
they would pick which account they wanted to use for this particular bot. And the next step was
you would pick the VIN number of the car you wanted to buy, and that VIN would get validated.
Otherwise you could have bots trying to buy a car that doesn't exist and
generating a lot of traffic. So it's
important to validate stuff like that. So as soon as the VIN was
validated, a little start button would appear. So instead of
being right on time when the sale was, you could do this
hours in advance, hit the start button, and then it would start
to count down. Now the way it would do this is it was
basically synchronizing its clock with the server clock of
the sales server. And this was really simple stuff. In the
meta refresh, in the HTML meta refresh, it would just start
refreshing every so often. And it would get, you know, as the
sale got closer and closer, it would refresh more often until
right at the end, it was like right lock step with the server
clock. And as soon as it timed out, it would go ahead and it
would attempt to purchase the car. Now this shows just one
bot client. It's a little bit more complicated than that.
And basically the bot clients acted as triggers for
the server that actually made the purchase. And there may have
been 16 to 30 of these bots running, triggering the server.
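
The countdown described above can be sketched roughly as follows. This is not the original code (the talk says that was PHP plus an HTML meta refresh); it is a minimal Python illustration under stated assumptions: the server clock is read from an HTTP Date header, and the interval rule (wait half the remaining time, capped at ten minutes, never less than one second) is invented for the example. All function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def clock_offset(server_date_header: str, local_now: datetime) -> timedelta:
    """Server time minus local time, taken from an HTTP Date header (assumed source)."""
    return parsedate_to_datetime(server_date_header) - local_now


def next_refresh_seconds(seconds_until_sale: float) -> float:
    """Refresh rarely when the sale is far off, more often as it gets closer."""
    if seconds_until_sale <= 0:
        return 0.0  # time is up: this is where the purchase would be triggered
    # Half the remaining time, between 1 second and 10 minutes,
    # but never waiting past the sale time itself.
    wait = max(1.0, min(600.0, seconds_until_sale / 2))
    return min(seconds_until_sale, wait)


def meta_refresh_tag(seconds: float) -> str:
    """Build the HTML meta refresh tag a lightweight bot client page could carry."""
    return f'<meta http-equiv="refresh" content="{int(seconds)}">'


local_now = datetime(2008, 1, 1, 17, 0, 0, tzinfo=timezone.utc)
offset = clock_offset("Tue, 01 Jan 2008 17:00:05 GMT", local_now)
print(offset, next_refresh_seconds(7200), meta_refresh_tag(next_refresh_seconds(10)))
```

Each page load would recompute the time remaining against the server clock and emit a new tag, so the client drifts into step with the server as the sale approaches.
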
Sometimes we'd miss one. But more often, the sale was
successful. And we would send an e-mail confirmation to my client
saying you bought this car. And we would also arrange for
financing for him. And then we would go back to the server and
make sure that the car actually was shipped correctly back to
his dealership. So the bot provided a lot of utility in
that regard. So how successful were we? Well, before, he wasn't
getting anything. And this was really frustrating for him
because these were cars he really wanted and he knew he
could make a profit on them given the price they were
selling for. After, we were getting probably about 95 to 97
percent of the cars he was trying to buy. So the difference was,
it was phenomenal. It was so much fun because even after I was
done developing this, I would get a call every day from my client
15 to 20 minutes after the sale, and he would say, Mike, we bought
five out of six today. We got seven out of seven. We got nine
out of 12. And I'm like, settle down. Don't get greedy here
because, you know, don't kill the golden goose. So why were we
successful at this? Well, the main problem with the old way is that people
had to wait for that stupid refresh button, or that buy it
now button, to appear. And there was so much
server lag that that was the problem. And usually whoever got
the buy button first was the person that bought the car. So
basically what we did is we got rid of the buy button. We just
got rid of it. And we replaced it with a timer that was
automated so you didn't need that person to buy the car. And
it would just know what time to buy the car, and it would go ahead
and buy it. This type of a bot is typically called a sniper, if
you've ever heard that term before. And I remember back in
the day when I was doing this, we were testing. And I was going
to write him an e-mail that said something to the effect of, I've
got six snipers waiting to hit cars at noon. And I was just about ready
to send that e-mail. And I started thinking about
Carnivore and some of the stuff that was happening back then. And
I thought, no, I'll just give him a call. Today I would never
send an e-mail like that. Never. I'm not even sure I'd make a
phone call. So yeah, watch your language. Okay. So everything
worked great for about six months. And then all of a sudden,
things weren't as rosy anymore. We started not, you know, Mark
would call, excuse me, the client would call, and he would
say, you know, we only got two out of seven today. Something's
wrong. And he did some research. And he discovered through his
connections, he's got lots of connections, that there was a
group of Russian hackers that were hired to write a competing
bot, and they were someplace out in New Jersey, or the
dealership was out in New Jersey, or something. Excuse me? Who, what?
I don't know. No comprendo. So competition is good, right? And
that leads to innovation. And I was kind of thinking, yeah,
let's, this is going to be fun now. We've got an arms race
going on here. So here's part two of the solution. What I did
differently is while I was synchronizing clocks with the
sales server, I started looking at lag time. And I got to the
point where I got really good at estimating how much lag time
there would be at the sale time. So in other words, what I was
essentially doing is I was estimating how many users were
on the system. And with that information, I would not just say,
well, I'm going to send one attempt to buy the car. Instead, for
each bot, I would launch maybe between, I forget what the real
number was because I haven't looked at the code for ages, but
I probably launched between five and seven attempts to buy the
car. And based on the amount of lag time that I was going to
anticipate at the sale time, I would launch them just a little
bit before, incrementally before the sale time. And this was real
successful. So now there would be a number of bots that would
launch, and each one of those basically had a warhead that
launched multiple attempts to buy the cars. And so our success
rate prior to making this fix, during the competition, he was
getting about 50%. After, we were back right on the
money, we were getting every car we wanted. And it stayed that
way through the duration of this program. So how successful was
the bot? These are all guesses. I'm going to show you how
successful this bot was. I don't have any hard facts here,
but I know it was in operation for about 40 weeks, and they were
buying roughly five cars a day. So it was about 800 cars, I'm
going to estimate, were purchased with this. If you
figure the average wholesale cost of the cars they were
purchasing, it was probably around $16,000. So in a 40-week
period, this bot purchased almost $13 million worth of cars. So this
has a huge impact on a small dealer like this one. So this is
a great example of not accepting the web as it is, not using
browsers the way everybody else would, and doing something
different, and not being afraid to step outside of the box a
little bit. So what would I do differently today if I was going
to do this? First, there were things that were done pretty
well back then, and things that I still do today. I
really like having very lightweight clients. The lighter
the better. Everything was easily updated because it was
all online, and it was easily distributed. I could make
changes on the server. It would get distributed everywhere
because basically these are just, these bot clients were
essentially just web pages with some JavaScript and stuff going
on. One of the things that I really definitely would do if I
were to do this over is I would definitely build in
some analytics and collect metrics. So I would
really want to know exactly what our success rate was. I would
want to know exactly how much these cars were purchased for. It
would be really great to also know how much they were sold
for, so it could actually show value. That's something I really
wish I had done. The other thing I think that would have been nice
if I were to do this over again is build in some process that
actually assists in the selection of which vehicles you want to
purchase. So in other words maybe what I would have done is
I would have also had my bot look at Kelley Blue Book and
figure out what the good wholesale prices are for cars
and see, look for discrepancies. Locate the ones that are
underpriced. That would have been a really good thing to do.
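
A minimal sketch of that kind of selection aid might look like the following. Everything here is hypothetical: the `Listing` fields, the 5% threshold, and the sample values are made up for illustration, and the sketch deliberately leaves out how the reference wholesale prices would actually be obtained.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    vin: str                # placeholder identifiers, not real VINs
    asking_price: float     # price shown on the sale site
    reference_price: float  # wholesale guide value, however it was obtained


def underpriced(listings, min_discount=0.05):
    """Return listings at least min_discount below reference, best deal first."""
    flagged = [
        item for item in listings
        if item.asking_price <= item.reference_price * (1 - min_discount)
    ]
    # The lower the asking/reference ratio, the bigger the discrepancy.
    return sorted(flagged, key=lambda item: item.asking_price / item.reference_price)


sample = [
    Listing("VIN-A", 15000, 16000),
    Listing("VIN-B", 15900, 16000),
    Listing("VIN-C", 14000, 17000),
]
for item in underpriced(sample):
    print(item.vin, item.asking_price, item.reference_price)
```

The output of something like this would feed the list of cars the client picks from, rather than replace his judgment.
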
The other thing that occurred to me actually within the last week
is that probably the only
thing I really needed to do here is make that buy it now button
happen, right? And I could have done that simply by making the
server act kind of like a proxy. So as the HTML is coming in with
the grayed out button, I could have just replaced it with a
real button and sent it off to the browser, right? That
probably would have worked. The problem there is that
conceivably you could have bought cars before the purchase
time. And that may have been allowed, but that's something
you don't want to do for the same reason you don't want to
buy cars that don't exist. You don't want to show your hand.
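
The proxy idea can be illustrated with a small sketch of the rewriting step only. It assumes the grayed-out button was an ordinary form control carrying a `disabled` attribute, which the talk does not confirm; the markup in the example is invented, and a regular expression is used for brevity where a real HTML parser would be more robust.

```python
import re

# A disabled attribute inside a tag: bare, or with a quoted or unquoted value.
_DISABLED = re.compile(r'\s+disabled\b(?:\s*=\s*(?:"[^"]*"|[^\s>]+))?', re.IGNORECASE)
_CONTROL = re.compile(r"<(?:input|button)\b[^>]*>", re.IGNORECASE)


def enable_buttons(html: str) -> str:
    """Strip the disabled attribute from every <input> and <button> tag."""
    def fix(match):
        return _DISABLED.sub("", match.group(0))
    return _CONTROL.sub(fix, html)


# Hypothetical markup for a grayed-out button.
page = '<form><input type="submit" name="buy" value="Buy Now" disabled="disabled"></form>'
print(enable_buttons(page))
```

A proxy would apply this to each response body before passing it on to the browser. Whether the sale server would then accept the form before sale time is a separate question, as noted above.
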
The website, the target, was a very traditional website. It
used HTML forms, which are really easy for me to emulate or
submit using just PHP and cURL. Today you don't find that so
often. You find a lot of JavaScript, you find a lot of
Ajax. There's a lot of JavaScript validation of form
data before it's submitted. It makes it a lot harder to do this
kind of thing today. So today the kind of approach that I take
now is I end up with a task queue, which is basically a
table in a database that keeps track of what needs to be done.
And there's a web interface into that. So in this particular case
my client would essentially be able to see what's going on in a
task queue. And that task queue would be fed to individual
computers, which I refer to as harvesters. And they can exist
anywhere. They can be in the cloud. They can be in a closet.
They can be in your office. They can be anywhere. And what I have
them do now, since there's so much more complexity in websites
and so much more use of JavaScript, they're able to do a
lot of client-side scripting. I do a lot of stuff in iMacros.
Anybody here use iMacros? It is the most amazing tool. It's just
an add-on for your browser that essentially lets you create a
macro for your browser that you can just play over and over
again. And what I do now is the harvesters will dynamically
create that macro so you can get them to do some very specific
things. Once I learned how to do that, there was not a
single website on the planet I could
not manipulate. It was like the gods handing me fire. It's like
here. Here, Mike, you've been a good boy. So that's what I do
now. And so I actually communicate through Firefox. So
it's very easy for me to emulate human activity now with
bots. So I would have them hit the sales server. The difficulty
there would be to get the timing down correctly. But I think that
could have been done. And then the harvesters, after they did
do their thing with the sales server, the target server, they
report back to the bot server and the queue is updated and
that's how you can tell what the results were of what you did.
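
The task queue and harvester loop described above can be sketched like this. It is an illustration only, using Python's built-in sqlite3 module; the table layout, column names, and function names are assumptions, not the speaker's actual schema.

```python
import sqlite3


def open_queue(path=":memory:"):
    """Create the task table: one row per piece of work to be done."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS tasks (
               id INTEGER PRIMARY KEY,
               job TEXT NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending',
               harvester TEXT,
               result TEXT)"""
    )
    return db


def add_task(db, job):
    """Queue a job (this is what the web interface would call)."""
    cur = db.execute("INSERT INTO tasks (job) VALUES (?)", (job,))
    db.commit()
    return cur.lastrowid


def claim_task(db, harvester):
    """Hand the oldest pending task to a harvester, or return None."""
    row = db.execute(
        "SELECT id, job FROM tasks WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    db.execute(
        "UPDATE tasks SET status = 'running', harvester = ? WHERE id = ?",
        (harvester, row[0]),
    )
    db.commit()
    return row


def report(db, task_id, result):
    """Called by a harvester when it has finished its task."""
    db.execute(
        "UPDATE tasks SET status = 'done', result = ? WHERE id = ?",
        (result, task_id),
    )
    db.commit()


db = open_queue()
task_id = add_task(db, "check listing A")
claimed = claim_task(db, "harvester-1")
report(db, claimed[0], "ok")
print(db.execute("SELECT id, job, status, harvester, result FROM tasks").fetchall())
```

The select-then-update in `claim_task` is fine for a single process; with several harvesters hitting one database, the claim would need to be made atomic, for example inside a transaction.
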
If you're interested in how that kind of stuff works, go on
YouTube and look up my DEF CON 17 talk because that's all about
manipulating iMacros in that way to do screen scrapers for very
difficult to scrape sites or difficult to automate sites. So
that's my talk. Thanks to all of you for coming. Thank you to
the call for papers goons.
